Abstract

A pseudo-random number generator (RNG) might be used to generate w-bit
random samples in d dimensions if the number of state bits is at least dw.
Some RNGs perform better than others and the concept of equidistribution 
has been introduced in the literature in order to rank different RNGs. 

We define what it means for an RNG to be (d, w)-equidistributed, and then
argue that (d, w)-equidistribution is not necessarily a desirable property.

1 Motivation

There is no such thing as a random number - there are only
methods to produce random numbers, and a strict arithmetic
procedure of course is not such a method.

John von Neumann [11, p. 768]

Suppose we are performing a simulation in d dimensions. For simplicity
let the region of interest be the unit hypercube $H = [0, 1)^d$.

For the simulation we may need a sequence $y_0, y_1, \ldots$ of points uniformly
and independently distributed in $H$. A pseudo-random number generator
gives us a sequence $x_0, x_1, \ldots$ of points in $[0, 1)$. Thus, it is natural to
group these points in blocks of d, that is

$$y_j = (x_{jd}, x_{jd+1}, \ldots, x_{jd+d-1}).$$

If our pseudo-random number generator is good and d is not too large, we
expect the $y_j$ to behave like uniformly and independently distributed points
in $H$.
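The grouping into blocks of d can be sketched in a few lines of Python. This is only an illustration: the helper name `group_into_points` is hypothetical, and Python's built-in `random.Random` stands in for whatever generator is actually under study.

```python
import random

def group_into_points(xs, d):
    """Group x_0, x_1, ... into points y_j = (x_{jd}, ..., x_{jd+d-1}).

    Any incomplete block at the end of the sequence is dropped.
    """
    usable = len(xs) - len(xs) % d
    return [tuple(xs[j:j + d]) for j in range(0, usable, d)]

# Stand-in generator (an assumption for this sketch, not the paper's choice).
rng = random.Random(12345)
xs = [rng.random() for _ in range(10)]

# Ten numbers in [0, 1) give three complete points in [0, 1)^3.
points = group_into_points(xs, 3)
```

Each point uses d consecutive outputs, so a simulation that consumes M points consumes dM outputs of the underlying generator.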

2 Pseudo-random vs quasi-random 

We are considering applications where the (pseudo-)random number
generator should, as far as possible, be indistinguishable from a perfectly random
source. In some applications, e.g. Monte Carlo quadrature, it is better to 
use quasi-random numbers which are intended for that application and give 
an estimate with smaller variance than we could expect with a perfectly 
random source. 

For example, when estimating a contour integral of an analytic function, 
we might transform the contour to a circle and use equally spaced points on 
the circle. 

However, when simulating Canberra's future climate and water supply, 
it would not be a good idea to assume that exceptionally dry years were 
equally spaced! 

3 Goodness of fit 

If we use the $\chi^2$ test to test the hypothesis that a set of data is a random
sample from some distribution, then we typically reject the hypothesis if the
$\chi^2$ statistic is too large.

However, we should equally reject the hypothesis if $\chi^2$ is too small
(because in this case the fit is too good) [9].
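A two-sided version of the test can be sketched as follows. The function names, the significance level, and the use of the Wilson-Hilferty approximation for the $\chi^2$ quantiles are all assumptions made for this illustration (the approximation keeps the sketch within the Python standard library); none of them comes from the text above.

```python
from statistics import NormalDist

def chi_square_statistic(counts):
    """Chi-square statistic for observed cell counts against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def chi_square_quantile(p, k):
    """Approximate p-quantile of chi-square with k degrees of freedom
    (Wilson-Hilferty approximation; adequate for moderately large k)."""
    z = NormalDist().inv_cdf(p)
    return k * (1 - 2 / (9 * k) + z * (2 / (9 * k)) ** 0.5) ** 3

def two_sided_verdict(counts, alpha=0.01):
    """Reject when the statistic is too large OR too small."""
    k = len(counts) - 1
    stat = chi_square_statistic(counts)
    if stat < chi_square_quantile(alpha / 2, k):
        return "too small"   # the fit is suspiciously good
    if stat > chi_square_quantile(1 - alpha / 2, k):
        return "too large"   # the fit is bad
    return "plausible"
```

For instance, 100 cells that each contain exactly 100 observations give a statistic of zero, which this sketch reports as "too small": a perfectly random source would almost never produce such an even split.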

4 Linear congruential generators 

In the "old days" people often followed Lehmer's suggestion and used linear 
congruential random number generators of the form 

$$z_{n+1} = a z_n + b \bmod m.$$

This gives an integer in $[0, m)$, so it needs to be scaled:

$$x_n = z_n / m.$$

Typically m is a power of two such as $2^{32}$ or $2^{64}$, or a prime close to such a
power of two.

Unfortunately, all such linear congruential generators perform badly in
high dimensions, as shown in Marsaglia's famous paper "Random numbers
fall mainly in the planes".

5 RANDU 

Some linear congruential generators perform disastrously. For example,
consider the infamous RANDU:

$$z_{n+1} = 65539 z_n \bmod 2^{31}$$

(with $z_0$ odd). These points satisfy

$$z_{n+2} - 6 z_{n+1} + 9 z_n = 0 \bmod 2^{31},$$

so in dimension d = 3 the resulting points $y_j$ all lie on a small number of
planes, in fact 15 planes separated by distance $1/\sqrt{1^2 + 6^2 + 9^2} \approx 0.092$.

In general, such behaviour is detected by the spectral test [6]. 
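The RANDU relation above is easy to check numerically. The following sketch (function and variable names are illustrative) generates a few thousand values, confirms that $z_{n+2} - 6 z_{n+1} + 9 z_n$ is always a multiple of $2^{31}$, and records which multiple occurs. After scaling to $[0, 1)$ that multiple is the index of the plane on which the triple lies.

```python
M = 2 ** 31  # RANDU modulus

def randu(z0, count):
    """Return z_1, ..., z_count for z_{n+1} = 65539 * z_n mod 2^31 (z0 odd)."""
    zs, z = [], z0
    for _ in range(count):
        z = (65539 * z) % M
        zs.append(z)
    return zs

zs = randu(1, 3000)
combos = [zs[i + 2] - 6 * zs[i + 1] + 9 * zs[i] for i in range(len(zs) - 2)]

# Every combination should be an exact multiple of 2^31.
residues = {c % M for c in combos}

# The multiple k satisfies x_{n+2} - 6 x_{n+1} + 9 x_n = k with each x in (0, 1),
# so -6 < k < 10, which leaves at most 15 integer values.
plane_indices = {c // M for c in combos}
```

The relation follows from $65539 = 2^{16} + 3$, so that $65539^2 = 6 \cdot 65539 - 9 \bmod 2^{31}$.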

Even the best linear congruential generators perform badly because they
have period at most m, so the average distance between points $y_j$ is of order
$1/m^{1/d}$ (so the set of points closest to any one $y_j$ has volume of order $1/m$).



6 Modern generators 

Nowadays, linear congruential generators are rarely used in high-dimensional 
simulations. Instead, generators with much longer periods are used. A 
popular class is those given by a linear recurrence over F2. These take the 
form 

$$u_i = A u_{i-1} \bmod 2,$$

$$v_i = B u_i \bmod 2,$$

$$x_i = \sum_{j=1}^{w} v_{i,j} 2^{-j},$$

where $u_i$ is an n-bit state vector, $v_i$ is a w-bit output vector which may be
regarded as a fixed-point number $x_i$, and the linear algebra is performed over
the field $F_2 = GF(2)$ of two elements $\{0, 1\}$. Here A is an n x n matrix and
B is a w x n matrix (both over $F_2$). Usually A is sparse (so the matrix-vector
multiplication can be performed quickly) and often B is a projection.

7 The period 

Provided the characteristic polynomial of A is primitive over F2, and $B \ne 0$,
the period of such a generator is $2^n - 1$. This can be very large, e.g. n = 4096
for xorgens [3] and n = 19937 for the Mersenne Twister [8]. For details we
refer to L'Ecuyer's papers [5, 12].
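A toy instance of such a generator makes the recurrence and the period claim concrete. The sketch below is an assumption-laden illustration, far smaller than any practical generator: it takes n = 4, lets A be the companion matrix of $x^4 + x + 1$ (a primitive polynomial over F2), and lets B project the state onto its first w = 2 bits. The helper names are hypothetical.

```python
def mat_vec_f2(M, u):
    """Matrix-vector product over F2 (all entries are 0/1 integers)."""
    return [sum(m & x for m, x in zip(row, u)) % 2 for row in M]

# Companion matrix of x^4 + x + 1, so the state has n = 4 bits.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 1, 0, 0]]

# B is a projection onto the first w = 2 state bits.
B = [[1, 0, 0, 0],
     [0, 1, 0, 0]]

def outputs(u0, count):
    """Return x_1, ..., x_count, where x_i = sum_j v_{i,j} 2^{-j}."""
    u, xs = list(u0), []
    for _ in range(count):
        u = mat_vec_f2(A, u)          # u_i = A u_{i-1}
        v = mat_vec_f2(B, u)          # v_i = B u_i
        xs.append(sum(bit * 2.0 ** -(j + 1) for j, bit in enumerate(v)))
    return xs

def period(u0):
    """Number of steps before a nonzero state u0 recurs."""
    u, steps = mat_vec_f2(A, u0), 1
    while u != list(u0):
        u, steps = mat_vec_f2(A, u), steps + 1
    return steps
```

Starting from the state (1, 0, 0, 0), `period` returns 15, which is $2^4 - 1$. The zero state is a fixed point of the recurrence, which is why the period is $2^n - 1$ rather than $2^n$.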

8 Equidistribution 

Various definitions of (d, w)-equidistribution can be found in the literature.
We follow Panneton and L'Ecuyer [12] without attempting to be too general.

Consider w-bit fixed-point numbers. There are $2^w$ such numbers in $[0, 1)$.
Each such number can be regarded as representing a small interval of length
$2^{-w}$.

Similarly, in d dimensions, we can consider small hypercubes whose sides
have length $2^{-w}$. Each small hypercube has volume $2^{-dw}$ and there are $2^{dw}$
of them in the unit hypercube $[0, 1)^d$. A small hypercube can be specified
by a d-dimensional vector of w-bit numbers (a total of dw bits).



Definition 

Consider a random number generator with period $2^n$. (A slight change in
the definition can be made to accommodate generators with period $2^n - 1$.)

If the generator is run for a complete period to generate $2^n$ pseudo-random
points in $[0, 1)^d$, we say that the generator is (d, w)-equidistributed
if the same number of points fall in each small hypercube.

The condition $n \ge dw$ is necessary. The number of points in each small
hypercube is $2^{n-dw}$.

RANDU (with n = 29) is not (d, w)-equidistributed for any $d \ge 3$, $w \ge 4$.
However, most good long-period generators are (d, w)-equidistributed for
$dw \ll n$.
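For a generator small enough to enumerate, the definition can be checked by brute force. The sketch below makes two assumptions that are not part of the definition above. First, the $2^n$ points are formed by taking one d-tuple of consecutive outputs per starting position in the cycle. Second, the example generator is a toy 4-bit linear congruential generator, $z \mapsto 5z + 1 \bmod 16$, chosen only because it has full period 16.

```python
from collections import Counter

def lcg_cycle(a, b, n_bits):
    """One full cycle of z -> (a*z + b) mod 2^n_bits, starting from z = 0.
    Assumes (a, b) were chosen so that the period is exactly 2^n_bits."""
    m = 1 << n_bits
    z, cycle = 0, []
    for _ in range(m):
        cycle.append(z)
        z = (a * z + b) % m
    return cycle

def is_equidistributed(cycle, out_bits, d, w):
    """Brute-force (d, w)-equidistribution test over one full cycle.

    Each output is an out_bits-bit integer; its top w bits select the cell
    coordinate. One d-tuple is taken per starting position (indices wrap).
    """
    N = len(cycle)
    counts = Counter(
        tuple(cycle[(i + k) % N] >> (out_bits - w) for k in range(d))
        for i in range(N)
    )
    # Every one of the 2^(dw) cells must be hit, and all equally often.
    return len(counts) == 2 ** (d * w) and len(set(counts.values())) == 1

cycle = lcg_cycle(5, 1, 4)
```

With this toy generator, `is_equidistributed(cycle, 4, 1, 4)` is true, while `is_equidistributed(cycle, 4, 2, 2)` is false even though dw = n, because some cells are hit twice and others not at all.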

9 Figures of merit 

The maximum w for which a generator can be (d, w)-equidistributed is
$w_d^* = \lfloor n/d \rfloor$. If a generator is actually (d, w)-equidistributed
for $w \le w_d$ then

$$\delta_d = w_d^* - w_d$$

is sometimes called the "resolution gap" [5] and

$$\Delta = \max_{d \le n} \delta_d$$

is taken as a figure of merit (small $\Delta$ is desirable). However, this only makes
sense when comparing generators with the same period. When comparing
generators with different periods, it makes more sense to consider

$$W = \sum_{d \le n} w_d$$

as a figure of merit (a large value is desirable). An upper bound is

$$W \le \sum_{d \le n} w_d^* \sim n \ln n.$$
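The upper bound is simple to evaluate. The following sketch (the function name is illustrative) computes $\sum_{d \le n} \lfloor n/d \rfloor$ exactly and compares it with $n \ln n$ for a few state sizes, including the two values of n quoted earlier for xorgens and the Mersenne Twister.

```python
import math

def max_total_resolution(n):
    """Upper bound on W: the sum of floor(n/d) over d = 1, ..., n."""
    return sum(n // d for d in range(1, n + 1))

for n in (64, 4096, 19937):
    bound = max_total_resolution(n)
    estimate = n * math.log(n)
    print(n, bound, round(estimate), round(bound / estimate, 4))
```

The ratio tends to 1 only slowly, since the next term in the expansion of the sum is linear in n.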

10 Problems with equidistribution 

A test for randomness should (usually) be passed by a perfectly random 
source. 

(d, w)-equidistribution applies only to a periodic sequence: we need to
know the period $N = 2^n$ (or $N = 2^n - 1$). A perfectly random source
is not periodic, but we can get a periodic sequence by taking the first N
elements $(y_0, y_1, \ldots, y_{N-1})$ and then repeating them ($y_{i+N} = y_i$). However,
this sequence is unlikely to be (d, w)-equidistributed unless d and w are
very small.

Consider the simplest case dw = n. There are $N = 2^n$ small hypercubes
and $N!$ ways in which each of these can be hit by exactly one of
$(y_0, \ldots, y_{N-1})$, out of $N^N$ possibilities. Thus the probability of
equidistribution is

$$\frac{N!}{N^N} \sim \frac{\sqrt{2\pi N}}{\exp(N)}.$$

Recall that $N = 2^n$ is typically very large (for example $2^{19937}$) so $\exp(N)$ is
gigantic.
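To get a feeling for how small this probability is, one can evaluate its logarithm for modest N. The sketch below uses `math.lgamma` for $\ln N!$ and compares the result with the Stirling estimate $\frac{1}{2}\ln(2\pi N) - N$. The function names are illustrative, and N is kept far below the values of practical interest so that floating-point arithmetic suffices.

```python
import math

def log_prob_equidistributed(N):
    """Natural log of N!/N^N, the chance that N uniform random points
    hit each of N cells exactly once."""
    return math.lgamma(N + 1) - N * math.log(N)

def stirling_estimate(N):
    """Natural log of sqrt(2*pi*N)/exp(N)."""
    return 0.5 * math.log(2 * math.pi * N) - N

for n in (4, 8, 16):
    N = 2 ** n
    print(n, log_prob_equidistributed(N), stirling_estimate(N))
```

Already for n = 16 the log-probability is about -65530, so the probability itself underflows double precision by a wide margin.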

Independence of ordering 

(d, w)-equidistribution is independent of the ordering of $y_0, \ldots, y_{N-1}$.

Given a (d, w)-equidistributed sequence, we can reorder it in any manner
and the new sequence will still be (d, w)-equidistributed.

For example, $y_j = j \bmod 2^n$ gives a (1, n)-equidistributed sequence.

A common argument 

It is often argued that, when n is large, we will not use the full sequence
of length $N = 2^n$, but just some initial segment of length $M \ll N$. If
$M \ll \sqrt{N}$ then the initial segment may behave like the initial segment of
a random sequence. However, if this is true, what is the benefit of
(d, w)-equidistribution?

11 Why consider equidistribution? 

The main argument in favour of considering equidistribution seems to be 
that, for several popular classes of pseudo-random number generators, we 
can test if the sequence is (d, w)-equidistributed without actually generating
a complete cycle of length N.

For generators given by a linear recurrence over F2, (d, w)-equidistribution
is equivalent to a certain matrix over F2 having full rank. However, the fact 
that a property is easily checked does not mean that it is relevant. We 
actually need something weaker (but harder to check). 
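The rank criterion can be sketched for a toy generator. The construction used here is an assumption based on the standard argument, not a quotation from the references: the top w bits of d successive outputs are a linear function of the initial state, given by stacking the first w rows of $B A^k$ for k = 0, ..., d-1, and the generator is (d, w)-equidistributed (counting the zero state) exactly when this dw x n matrix has rank dw. The example matrices and helper names are illustrative.

```python
def mat_mul_f2(X, Y):
    """Matrix product over F2 (entries are 0/1 integers)."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]

def rank_f2(M):
    """Rank over F2 by Gaussian elimination on rows packed into integers."""
    rows = [int("".join(map(str, r)), 2) for r in M]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def is_equidistributed_by_rank(A, B, d, w):
    """Rank test; rows of B are assumed ordered most significant bit first."""
    if d * w > len(A):
        return False
    block = [row[:] for row in B[:w]]   # top w output bits: B_w A^0
    stacked = []
    for _ in range(d):
        stacked.extend(block)
        block = mat_mul_f2(block, A)    # next block: B_w A^(k+1)
    return rank_f2(stacked) == d * w

# Toy generator: n = 4, A is the companion matrix of x^4 + x + 1, w = 2.
A = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0]]
B = [[1, 0, 0, 0], [0, 1, 0, 0]]
```

For this toy generator the test reports (4, 1)-equidistribution but not (2, 2)-equidistribution, and it does so from a 4 x 4 rank computation, without enumerating the cycle.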



12 Conclusion 

When comparing modern long-period pseudo-random number generators, 
(d, w)-equidistribution is irrelevant, because it is neither necessary nor
sufficient for a good generator.